
    Tiny Corpus Applications with Transformation-Based Error-Driven Learning : Evaluations of Automatic Grammar Induction and Partial Parsing of SaiSiyat

    This paper reports preliminary results on automatic grammar induction, based on the framework of Brill and Markus (1992), and on binary-branching syntactic parsing of Esperanto and SaiSiyat (a Formosan language). Automatic grammar induction requires a large corpus and is therefore found to be impractical for endangered minority languages. Syntactic parsing, by contrast, needs only a tiny corpus and works well with corpora segmented by intonation unit, which yields high accuracy.

    DEVELOPING AN ONLINE CORPUS OF FORMOSAN LANGUAGES

    Information technologies have now matured to the point of enabling researchers to create repositories of language resources, especially for languages facing endangerment. The development of an online corpus platform, made possible by recent advances in data storage, character encoding and web technology, has profound consequences for the accessibility, quantity, quality and interoperability of linguistic field data. This is of particular significance for the Formosan languages of Taiwan, many of which are on the verge of extinction. In response to this growing problem, the NTU Corpus of Formosan Languages was established to document and thus preserve valuable linguistic data, as well as relevant ethnological and cultural information. This paper introduces some of the theoretical bases behind this initiative, as well as the procedures, transcription conventions, database normalization, in-house system and three special features involved in creating the corpus.

    Constraints on the χ_(c1) versus χ_(c2) polarizations in proton-proton collisions at √s = 8 TeV

    The polarizations of promptly produced χ_(c1) and χ_(c2) mesons are studied using data collected by the CMS experiment at the LHC, in proton-proton collisions at √s = 8 TeV. The χ_c states are reconstructed via their radiative decays χ_c → J/ψγ, with the photons being measured through conversions to e⁺e⁻, which allows the two states to be well resolved. The polarizations are measured in the helicity frame, through the analysis of the χ_(c2) to χ_(c1) yield ratio as a function of the polar or azimuthal angle of the positive muon emitted in the J/ψ → μ⁺μ⁻ decay, in three bins of J/ψ transverse momentum. While no differences are seen between the two states in terms of azimuthal decay angle distributions, they are observed to have significantly different polar anisotropies. The measurement favors a scenario where at least one of the two states is strongly polarized along the helicity quantization axis, in agreement with nonrelativistic quantum chromodynamics predictions. This is the first measurement of significantly polarized quarkonia produced at high transverse momentum.